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Abstract 

This paper addresses the inference of spatial dependence in the context of a recently proposed 
framework. More specifically, the paper focuses on the estimation of model parameters for a class 
of generalized Gibbs random fields [1^, i.e., Spartan Spatial Random Fields (SSRFs). The problem 
of parameter inference is based on the minimization of a distance metric. The latter involves a 
specifically designed distance between sample constraints (variance, generalized "gradient" and 
"curvature") and their ensemble counterparts. The general principles used in the construction of 
the metric are discussed and intuitively motivated. In order to enable calculation of the metric from 
sample data, estimators for generalized "gradient" and "curvature" constraints are constructed. 
These estimators, which are not restricted to SSRFs, are formulated using compactly supported 
kernel functions. An intuitive method for kernel bandwidth selection is proposed. It is proved that 
the estimators are asymptotically unbiased and consistent for differentiable random fields, under 
specified regularity conditions. For continuous but non-differentiable random fields, it is shown 
that the estimators are asymptotically consistent. The bias is calculated explicitly for different 
kernel functions. The performance of the sample constraint estimators and the SSRF inference 
process are investigated by means of numerical simulations. 
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I. INTRODUCTION 



aitial Random Fields (SRF's) have a wide range of applications in subsurface hydrology 
22 1, petroleum engineering [ ^ , environmental data analysis mining explo- 

ration and reserves estimation JI?!, and environmental health among other fields. From 



the applications viewpoint, the main goals are first to characterize the spatial continuity of 
such processes, and then to exploit the continuity for spatial estimation (prediction) and 
simulation. A methodological problem of continuing interest is the inference of the random 
field parameters that characterize the spatial continuity from the experimental data. The 
latter are typically distributed on irregular sampling grids. 

This paper seeks to address the issue of random field inference within the context of a 
specific model, the Fluctuation-Gradient-Curvature (FGC) Spartan Spatial Random Field 
(SSRF), which was introduced in 1^. SSRFs result from a convolution of a kernel function 
with an underlying SRF that may include non-resolved fine-scale detail at length scales below 
A. The kernel function acts as a low-pass filter that suppresses the spectral component of 
fluctuations above a cutoff wavevector kc- The removed part corresponds to sub-resolution 
scales. We will denote sample averages with a horizontal bar over the averaged quantity, 
and ensemble averages with the mathematical expectation symbol, i.e., mx = E[Xa(s)]. 
Without loss of generality in the following it is assumed that mx = 0. 

Let the data be given by the measurements X* = {X^, . . . , X*}, of the scalar quantity X 
at the set of sampling locations Sm = {si, . . . s„} in the domain Qn ^ The area enclosed 
by the convex hull of fi„ is denoted by |fi„,|. We assume that the data can be modeled 
as a sample (realization) of Xa(s), which is a Gaussian, weakly stationary, isotropic FGC- 
SSRF. The isotropic assumption is not a major restriction, since under certain conditions 
the anisotropic parameters can be established and isotropic conditions can be restored by 



rotation and rescaling transformations 
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131, Q. The isotropic FGC-SSRF involves the 



parameter set 6 = {rio,'r]i,C,,kc): the scale coefficient rjQ determines the variance, the shape 
coefficient rji determines the shape of the covariance function, the characteristic length ^ 
determines the range of spatial dependence, and kc is a wavevector cutoff related to the 



121. 



resolution scale A 

Regarding parameter inference, the main idea introduced in and further elaborated 
here, is that the SSRF model parameters can be estimated by matching sample constraints 
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and their ensemble counterparts. The model parameters are determined by treating the 
sample constraints as estimators of the respective stochastic constraints. This perspective 
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23- 



relies on the validity of the ergodic hypothesis 

The constraint matching idea is similar to the standard approach, in which the experi- 
mental variogram is matched with various model functions to determine an optimal model 
of spatial continuity. However, there are significant differences between the two approaches. 
(1) In the SSRF approach the number of estimated constraints is small (four in the case of 
the FGC-SSRF). This is due to the efficient parametrization of spatial dependence in the 
FGC-SSRF, which is based on interactions instead of the covariance matrix. In contrast, 
variogram modeling attempts to match the entire functional dependence of the variogram 
function. (2) The FGC-SSRF includes a family of covariance functions that account for 
various types of spatial continuity ^jE^- Hence, in practice fitting the sample constraints 
with one SSRF model may be sufficient. In contrast, the experimental variogram is fitted 
with a number of model functions to determine the "optimal" spatial model. (3) The SSRF 
sample constraints focus on the short-range behavior of spatial continuity. This is motivated 
by two observations: in geostatistical applications, the long-range behavior can only be es- 
timated with significant uncertainty; in addition, it is known that the long-range behavior 
does not have a significant impact on optimal linear prediction in regions where the field is 



densely sampled 



2J|. (4) The computational complexity of SSRF constraint calculations. 



at least on regular grids, is 0{mn), where m is o{n) and depends on the kernel bandwidth, 
while for variogram calculations the respective complexity is at best O(nlogn) if tree-based 
structures are used, or 0{n^) using standard methods |l2l ]. 

The main results obtained in this paper include the following: (1) Generalized gradi- 
ent and curvature estimators are formulated in terms of kernel averages, and a consistent 
method for selecting the kernel bandwidths is proposed. The generalized gradient and cur- 
vature estimators have a wider scope than the FGC-SSRF model: They are defined for both 
differentiable and continuous (but non-differentiable) spatial models. In the differentiable 
case, the estimators are defined in terms of finite-difference approximations of the respective 
derivatives. In the continuous case, the finite differences are not divided by the corresponding 
length spacing (step) in order to obtain asymptotically well defined quantities. (2) Conver- 
gence properties for the generalized gradient and curvature estimators are proved. (3) The 
constraint-based parameter inference procedure introduced in is improved by adding 
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a constraint that eliminates the nonhnear dependence of the model variance on the FGC- 
SSRF parameters. (4) Numerical simulations establish the performance of the constraints 
estimators and the parameter inference procedure. 

The remaining of this paper is structured as follows: An introduction to the FGC-SSRF 
in continuum space is given in Section (jlTl. In Section ()111|) the definition of the sample 
constraints on regular grids is reviewed. This is followed by the definition of generalized 
stochastic constraints for the FGC-SSRF in Section pV|) . Generalized sample constraints 
for the gradient and curvature are defined in Section (j^)- Theorems establishing the conver- 
gence of the constraint estimators are stated and proved in Section (jVIj) . Subsequently, the 
parameter inference process developed in is reviewed and refined in Section ()V11|) . Fi- 
nally, numerical simulations are used to validate the estimators and the parameter inference 
process in Section ()VIII|) . 



II. REVIEW OF THE FGC-SSRF MODEL 



In general, a Gibbs random field has the following joint probability density function 

/.|A-.(s);«| = 52i{z|£p^. (1) 

where H[X\{s)] 0] is the energy functional, is a set of model parameters, and the constant 
Z{6), called the partition function is the p.d.f. normalization factor obtained by integrating 
exp {—H[Xx{s); 6]} over all the realizations of the SRF. The FGC p.d.f. in is determined 
from the equation: 

i/fgjX,(s); 0] = I rfs ifg, [X,(s); 0'] , (2) 

where 6' = {rji,^, kc), and Lfgc is the normalized (to r/o = 1) local energy density at point s. 
The functional hfgc [Xx{s); 0'] is given in the continuum by the following expression 

h,,4X,{sy,0'] = [X,is)f + r^,eNX,is)f + e [V^Xa(s)]', (3) 

The functional Q is permissible if the resulting covariance function is positive definite, i.e., 
if it satisfies Bochner's theorem Permissibility constrains the value of ?7i (see 
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The explicit, albeit non-linear, dependence of the p.d.f. on three physically meaning- 
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ful parameters, rio,rii,^, instead of three linear coefficients multiplying the terms [Xa(s)] 
[VXa(s)]^ and [V^Xa(s)]^ , simplifies the parameter inference problem and allows intuitive 
initial guesses for the parameters. 

The FGC model has a particularly simple expression in Fourier space. If the Fourier 
transform of the covariance function is defined by means of 



G'.;A(k;0) = J dre"^'^-^Gx;A(r;0), (4) 
then the energy functional in Fourier space is given by 

iffge[X,(s); 6] = I dkX,{k) G-)^{k; 6) X,(-k). (5) 

The interaction is diagonal in Fourier space, i.e., the precision matrix G^.\(k;6) couples 
only components with equal wavevectors. For a real- valued SSRF Xa(s) it follows that 
Xa(— k) = Xj(k). Since Bochner's theorem guarantees the non-negativity of the covariance 
spectral density, it follows from (jSJ that the energy is a non-negative functional. 
The covariance spectral density follows from the expression: 



QA(k) 



where Qx(k) is the Fourier transform of the coarse-graining kernel, which determines how 
the fiuctuations are cut off at the resolution scale A For isotropic SSRF's, Qx(k) has no 
directional dependence. In 12,^^, a kernel having a boxcar spectral density with a sharp 
wavevector cutoff at kc was used. This kernel leads to a band-limited covariance spectral 
density Gx-x{k] 6). 



III. GENERALIZED ENERGY FUNCTIONAL 

The energy density defined by Eq. is valid in the continuum, and for differentiable 
SRFs. Generalized versions of the functional that are valid on regular lattices can be defined. 
For example: 



5 



Definition 1 We define the local energy terms So, Si{ai), and S2{a2), Vs G L^, as follows: 



So = [Xxis)]\ S^ia,) = ^ [x,(s + aie,) - X,(s)] '/a? 



where is the centered second-order difference operator in the direction ej, i.e, 

A^^\Xxis)] = [Xa(s + a2 e,) + Xa(s - e,) - 2Xa(s)] /aj. 

So represents the square of the fluctuations, Si the square of the generalized gradient, and 
S2 the square of the generalized curvature. 

The generalized gradient and curvature terms above are expressed in terms of finite 
differences instead of derivatives. These terms replace the gradient and curvature in Q. 
On a hypercubic lattice L^^ C Z"' in c? dimensions with step a, one obtains ai = 02 = a. The 
sample counterparts of Si{s), i = 0, 1,2, obtained by replacing Xa(s) with X*(s), are thus 
well-defined even for non-differentiable SRFs. Parameter inference is based on matching the 



sample constraints, iSj(s), with the stochastic constraints, E[S'j(s)], as shown in |l2f]. 

Definition 2 The ensemble moments IE[S'o], E[S'i(ai)] and E[S'2(a2)] provide the SSRF 
model constraints. These can be expressed in terms of the variance Gx{0) and the semi- 
variogram function Fx as follows: 



E[So] = Ga(0), 



(7) 



.(1) 



Fxiai) 



E[52(a2)] := 



02(02) 



.(2) 



Fx{a2) - c'-^^ Fx{V2, 



a2 



.(1) 



Fxi2a2) 



"2 



(8) 

(9) 



where c^P = 2d, c 



(2) 



d^, and cf = 4:d{d - 1] 



Remark 1 The SSRF constraints are expressed in terms of the semivariogram Fa, but 
this does not imply that the experimental variogram is required for determining the spatial 
model. 
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The dependence of the stochastic constraints on the SSRF parameters 6 is not shown 
exphcitly to keep the notation concise. The stochastic constraints are well defined for the 
FGC-SSRF, which are differentiable if is finite In the case of continuous but non- 
differentiable models, the ratios 0i(ai)/a^ and (f)2{ci'2) / (^t '^'^^ ^^^^ defined when ai, 02 — *■ 
0. Then, the constraints are defined in terms of the quantities 0i(cii) and 02(02) respectively. 



IV. CONSTRAINT DEFINITIONS ON IRREGULAR GRIDS 

In most geostatistical applications the available sample is distributed on an irregular 
sampling grid. In order to infer the model parameters, suitable stochastic and sample-based 
constraints need to be defined. On regular grids the lattice symmetry leads to obvious 
choices for the distance increments (steps) ai and 02 and the finite differences. On irregular 
grids, we formulate the sample gradient and curvature constraints using kernel averages. We 
also define steps oi and 02, suitable for general sampling point distributions. In addition, the 
kernel bandwidths are selected so as to yield good asymptotic properties for the generalized 
gradient and curvature estimators. 



A. Stochastic FGC-SSRF Constraints 



The stochastic constraints are related to the SRF model and thus do not depend on 
the sampling point distribution. Hence, the constraints defined in ©-(jni) can be used for 
irregular grids as well. The dependence of the constraints on the SSRF parameters is made 
explicit using the spectral representation of the covariance function [25]. If we define v = 
{kciYi II(t>) = 1 + riiv + v'^^ and w = v^^"^^^^ then for any a > the following relations hold: 

Voe^'-' r. (.-2)/4 , , ^ I^a(^) r (.r,. 

The variance stochastic constraint, obtained for a = 0, is given by 

^i^oj-2,^,/2r(rf/2) X n(^) • ^^^^ 

In general, the dependence of E[S'o] on r/i and is nonlinear [3], i.e., = VoQdivi^ kd). 
The function QdijIii^cC) tends to an asymptotic finite bound as kd 00. The bound is 
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attained with very good accuracy if kc^ = q^, where qd is an 0(1) constant that depends on 
d. To ehminate the dependence of the SSRF variance on rji and kcC,, we impose the relation 



7io = 2'^n'^/'Tid/2)al 
which is equivalent to the following normalization constraint: 



E[S',] 



dvv 



niv) 



1. 



(12) 



(13) 



The stochastic constraint for the generalized gradient is given by 

^(1)^2 roo 



nSiiai)] 



n Jo 



dv 



«1 



\Qx{w) 



(14) 



where 7^ = (20"'/^"^r(d/2). 

Based on Q and (fTUI) . the stochastic constraint for the generalized curvature is given by 



E[5'2(a2)] 



2 /"OO 



dv 



*2 JO 



cyJd/2-i(a2w) - c 



n(t;) 



,d/2-l _ 



Id- 



,{d-2)/A 
d/2-l 



Jd/2-l{0'2V2w) (1) Jd/2-l(2a2 w) 



(3) '-'d/2 



2a!/4-l/2 



2d/2-l 



(15) 



The selection of rjQ based on Eq. (fT^ increases the number of SSRF constraints to four; 
the other three are given by the equations pi|). ()14|) and ()15p. Thus, the number of 
parameters matches the number of constraints. 

The steps ai and 02 depend on the sampling point distribution. Their selection is dis- 
cussed in Section flVl|) below. In general, 02 is different from ai. To incorporate the spatial 
modelling of data from non-differentiable distributions, one should focus on the quantities 
0i(o.i) = of E[S'i(ai)] and 02(^2) = 03 IE[S'2(a2)] instead of E[S'i(ai)] and E[S'2(a2)]. 



V. SAMPLE CONSTRAINTS 



We formulate sample constraints that provide 'well-behaved' estimators of the model con- 
straints defined above. We emphasize that the following sample estimators of the generalized 
gradient and curvature constraints are not restricted to the FGC-SSRF model. 

The variance constraint is local, i.e., it does not involve differences between neighboring 
points. Hence, if the distribution of the sampling points is uniform, it is sufficient to use the 
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classical variance estimator. To estimate a non-zero mean one can use the sample average, 
rhx = view of which the sample variance Sq is given by: 

cSo = -5^[X*(s,)-m.]'. (16) 

i=l 

Declustered estimates of the mean and the variance can be used if the sampling point 
distribution is non-uniform, in order to obtain unbiased estimates of the variance. However, 
non-ergodic fluctuations, which often dominate the estimation of the variance from a single 
sample, are not significantly reduced by cell declustering. Another possibility is using the 

I I 

kernel-based variance estimator [8], which has improved convergence properties. However, 
we are not aware of a systematic method for selecting the kernel bandwidth for the variance. 



A. Kernel Averages of Sample Functions 

To define sample-based generalized gradient and curvature constraints we use isotropic 
kernel functions K{r) with suitably selected bandwidth parameters, hi (for the gradient 
estimator) and /i2 (for the curvature estimator). The selection of the bandwidths is guided 
by consistency principles that link them to the respective steps, ai and 02. 

The kernel is a bounded, positive, and compactly supported [0, R] function. Hence, 
the moments 

mKj= / dss^-^K{s), (17) 



and 

-R 

(2) 



dss^-^K\s). (18) 
are finite for all j G In addition, we define the kernel moment ratio: 







!!^E!i±P_ (19) 

mK,d 

In practice non-compactly supported kernels (e.g., the Gaussian kernel,) that decrease to 
faster than polynomially work just as well as compactly supported kernels. 

The following notation is introduced to facilitate calculations with kernel averages. For 
Si,Sj G Sm, Sjj := Si — Sj will denote the distance vector, and Sij := ||sjj|| its Euclidean 
norm. 
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The abbreviations Wij :— K(sij/hi), and Qi,j{r) '■— K(sij/rh2) where r = 1, 2, -\/2 will 
be used for the kernel weights. The weights Wij and Qij{r) are random variables, due to 
the variability in the sampling positions. 

The symbol Yl'ij ^^^^ denote a summation over both position indices i and j excluding the 
diagonal terms i = j. Similarly, the triple summation Yl[jf, and the quadruple summation 
y^!- h I will exclude all the terms in which at least two indices take the same value. 

Given kernel bandwidths hi and /i2, and a two-point sample function — A{X*,X*) 
or a function of sampling positions Aij — A{si,Sj), where i,j — l...,n, the off-diagonal 
kernel-weighted averages will be denoted by: 

K,AAj}:=Y!--^ijAj, (20) 

Kr/i2 {Aij} := ^ .Qi,j{r) A'ij- (21) 
The normalized kernel average of A^j is defined by means of the equation: 

A, := ^\ ; ■ (22 

More specifically, we will denote the sample increment SRF by means of 

X:y.^X*{s,)-X*{s,). 

The sample function that represents the kernel average of X*j with a kernel bandwidth h 
will be denoted by 

Mh):={[X:f)^. (23) 

The random variable fx{h) incorporates variabihty due to both X* and the samphng posi- 
tions. 



B. Definition of Sample Gradient and Curvature Estimators 

The notation introduced above is now used to define sample estimators for the squares 
of the generalized gradient and curvature. 
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Definition 3 The sample average of the square generalized gradient is defined as follows: 

(1) (1) /, X 

5r(aO:^|.(ra^),.,^|.S'.):^^. (24) 

where ^(/ii) =dfx{hi). 

The bandwidth hi is related to ai by means of the following consistency principle: 

= {^h),i ■ (25) 

The step-bandwidth dependence introduced by the consistency principle is physically 
motivated, because only ai represents an actual length scale. By adopting ()25|) . the kernel 
average in is forced to focus on points separated by distances controlled by the step 
value. This makes sense for calculations of generalized gradient and curvature terms. The 
sample constraint defined in (f^^ is analogous to the respective stochastic constraint in (jH)). 
In addition, the consistency principle ensures that, for differentiable SRFs, Si{ai) is an 
unbiased estimator of E[5'i(ai)] (see below). 

We introduced the three related quantities, fx{hi), Tplihi) and 5i(ai) for the following 
reasons: If the observed SRF can be considered differentiable, the generalized gradient 
constraint Si{ai) is well defined at the limit ai 0. If the observed SRF is continuous 
but non-differentiable the limit of Si{ai) as ai — does not exist. In this case, it makes 
more sense to work with the kernel-averaged square increment, Tpi{hi). To simplify the 
accounting it is often advantageous to work with the square increment per direction, 
fx{hi); in the isotropic case (pi{hi) and fx{hi) are simply proportionally related. Similar 
comments apply to the case of the generalized curvature constraint. 



Definition 4 The sample average of the square generalized curvature is defined as follows: 



^52(02) 



cfliiih^) fx{h2) - cf^,2{h2)fx{V2h2) - cffx{2h2) , (26) 



.(3) 



(1)- 



^2 

the constants /Ui(/i2), A*2(^2) o,re given by the following averages: 



.(3) 



/il(/l2 



/i2(/i2) (4j> 



.(1) 



.(2) 



(27) 
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/^2(/^2) = ^-^jf. . (28) 

ViJ/^h2 d \^id/h2 /e2 \ 

y^id/h2 

The bandwidth h2 is linked to the step by means of the consistency principle: 

4 = ' (29) 

The sample constraint 52(02) given by includes three terms that correspond to the 
terms in the respective stochastic constraint Q- The coefficients /ii(/i2) and 7x2(^2) in ()26p 
incorporate the impact of the sampling network topology and the kernel function used. As 
shown in Lemma ((Zj), Section ()VI D|l . the coefficients /ii(/i2) and ^2{h2) are approximately 
equal to 1. Their precise form is selected to ensure that, in the case of differentiable SRFs, 
the generalized curvature constraint is asymptotically unbiased. 



VI. ASYMPTOTIC PROPERTIES OF CONSTRAINT ESTIMATORS 



The asymptotic limit corresponds to n — > 00. At the limit it is assumed that l/|f2n| 0. 
In order to establish the asymptotic properties of the sample estimators for the generalized 
gradient and curvature, we first present some formalism and the conditions required for the 
validity of the proofs. 



A. Formalism 

The following notation will be used in the proofs of asymptotic behavior. 

1. The sampling locations will be expressed where L„ oc is the 

characteristic domain scale, and Uj denote the realizations of the random vector 
U, D [0, l]'^. 
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2. For any vectors Vj, Vj, the pair distance will be denoted by Vjj := Vj — Vj, and its 
Euclidean norm by Vij ||vi — Vj||. 

3. In integrals of kernel averages, the distance of the normalized sampling locations will 
be denoted by a; := Uj — Uj. 

4. A vector uj will be expressed in spherical polar coordinates as 

u)=uju^ and (ci>)j = cos6'i |T sin6'j 

where = 7r/2, = 0, 9d-i G [0, 27r), and 9i e [0, tt) for i = 1, . . . , d-2. The Jacobian 
of the transformation is given by uj^''^Jd{d) where 6 — (^i, . . . , 9d-i) and 

Jd{d) = (sin 9i)'^~'^ (sin 6*2)'^"^ . . .sin 9d-2- 

The area of the d— dimensional unit sphere will be denoted by: 

Ad := / deJdiO), 

w^^ere jg^ Jo ■■■Jo Jo ■ 

5. The following aspect ratios will be used as small perturbation parameters: Pn '■= hi/ 
and qn := h2/Ln. 



B. Conditions 

The following conditions will be assumed to hold: 

1. The normalized location random vectors Ui, . . . , U„ are assumed to be independent 
and identically distributed. 

2. The probabihty density function (pdf) /i(uij) of the samphng- location pair-distance 
vector is continuously differentiable in a neighborhood of zero. 
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3. The joint pdfs /2(uij, Uj^fe) and /3(ujj, Ui^k, Uj,;) are also continuously differentiable in 
a neighborhood of zero. 

4. The conditional pdf gu{uij\uj — u) is uniformly bounded in u. 

5. The joint moments of X* are identical to those of Xx- For example, V s^, Sj, with 
i ^ j, F*{si - Sj) = Fx{si - Sj). 

6. The model semivariogram Fx is continuous in a neighborhood of zero. 

7. There exists a continuous and bounded function ipx such that 

Cov [{X*^)', = ^x s,,„ s,,„ s,,,) . (30) 

For example, if Xx is a Gaussian SRF with semivariogram Fx, 



ipxiui, U2, U3, U^) = 2 Fx{Ui) + Fx{u2) - Fxius) - Fx{u4) 



2 



8. If e << 1, there exist Ci > 1 and three continuous functions gi, §2 and such that: 
and 



V'A(0,0,es,es) = g'i(s) ( J ) +0(7 



^A(esi, es2, 0, ess) = g2{si, S2, S3) ( 7 ) +0(7 



^/'A (m3, ||e'*^2 + Watt's - e<^i||, \\uz(^z + e'*^2||, Hms^s - ewi 
5^3(1*3, a;i,a;2, 01, 02, ^a) ^ (f) ' 

For example, if Xx is a Gaussian SRF with a spherical or exponential covariance 
function, the above conditions hold with Ci — 1. In the case of differentiable covariance 
models commonly used (Gaussian, hole-type, rational quadratic, Cauchy) one has 
ci = 2. 
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9. The following integral of the function is bounded: 

10. The bandwidths hi and /i2 tend to as n tends to oo. 

11. At the asymptotic limit, tends to as n tends to oo. This condition is 
satisfied simultaneously with l/|n„| — > 0, if oc and < 5 < 1. 

Conditions (Pi"® specify properties of the sampling point distribution. Condition (0) 
expresses the correspondence between the SRF model and the sampled data. Conditions 
dni)-© specify properties which are satisfied by default for FGC-SSRFs. They are explicitly 
stated here, because the convergence properties of the constraint estimators are proved for 
more general cases, including non-Gaussian SRF models. In particular, conditions ((Tj)-® 
are used in the analysis of the sample constraints variance. Conditions (fTmi -illlj ) imply an 
asymptotic densification of the sampling network, since the area enclosed by the convex hull 
increases slower than the number of points. For regular grids, this condition is obtained if 
the spacing decreases as the number of nodes increases. The densification conditions are 
necessary for proving asymptotic convergence of the estimators. 

Lemma 1 // the above conditions hold, the following is true: 

Pr ( lim Kh, {A J = n^E [W,,, A J ) = 1, (31) 

V ra— >oo / 

where the indices i,j refer to any pair of non-identical sampling points. Similarly, if the 
summation is over a weighted k — point {k G Z) non-diagonal function, the result is ccn^. 

This ergodic result follows directly by applying the arguments in the proof of Theorem 3.1 
in Hall et al. Isl], which will not be repeated here. Equation (|31|) enables the calculation of 
sample kernel averages in the asymptotic limit. 
We will also use the following lemma: 

Lemma 2 Let he a sequence of uniformly hounded random variahles such that X„ = o(l) 
almost surely. Then ¥\Xri\^ = o(l), for any k. 
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C. Generalized Gradient Estimation 



In this section we prove a relation between the "gradient" step and the kernel bandwidth 
hi, and we propose a physical estimate for the step. We also investigate the asymptotic 
properties of the mean and variance of the generalized gradient estimator. 

Lemma 3 The following relation holds between the bandwidth hi and the gradient step ai : 

1 /2 

tti = hiB2 +0{hiPn) a.s., (32) 
where B2, is the kernel moment ratio defined in My\) . 
Proof: 

Based on the consistency principle (|^. the step ai is expressed as follows: 

al = ^'^ (33) 

The above can be calculated explicitly in the asymptotic regime using Lemma (^. 
Leading-order calculation o/E[PVjj]. 



nWi,,]= j dujK{L^\\uj\\/hi)fi{u). 



The dominant asymptotic contribution from the integral is evaluated by means of a Taylor 
expansion of the pdf /i in terms of the small parameter p„ using the condition (j21), i.e., 

= I dujoo^-'Kiuj/pn) [ dO Jd{e)fi{ujLb) 
Jo JSd 

= pi r duu^-^Kiu) [ de j,{e)fi{p^uu) 

= ptA,fi{0)mK4 + O{pt''). (34) 

This expansion gives the asymptotically dominant term of E[PVjj]. Based on Lemma it 
follows that: 

J]' W,,, = n'pi Aa /i(0) mK,d + O (n^pf i) a.s., (35) 
hi 
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where is the area of the c?— dimensional unit sphere defined in paragraph of the 
Notation subsection. 

Leading-order calculation of E [Wij sf^'j . 



E[W,,sl] = LlE[slK{p^U,j)]=Ll J du: ||a;f (||a;||/p„) A(a;) 
= L2pfM,/i(0)mK,.+2 + O(L2pf3). 



(36) 



Hence, from Lemma (Q) and equation (|36|) it follows that 



Y^W,, si = n'Ly+'A,h (0) mK4+2 + O {v?Ll p^^') a.s. 



(37) 



Based on (j^Hj) . (jH^j) . and (jHTf) the asymptotic behavior of the step ai is given by: 



aj = h\ B2 + 0{hipn) 



a.s.. 



1. Selection of Distance Step 

Lemma is valid for any step ai. We define by ?Bo the set that includes for every 
sampling point Sj the distance vectors from all its near neighbors Sj, and also Nq = |53o|. 
A sensible estimate cii is the geostatistical d-power average of the Euclidean distances, 
Ap = \\si — Sj\\, of all the vectors in 5Bo i.e.. 



This definition implies that di is a random variable that depends on the sampling point 
configuration. In connection with the consistency principle, the bandwidth hi is also a 
random variable. However, since di represents an average over all the near neighbor distances 
for all the points, its fiuctuations are not very significant. In particular, the coefficient of 
variation declines with the number of sample points. To avoid cumbersome notation we will 
not distinguish between h"^ and a^, m G Z+ and the respective stochastic moments in the 
following theorems on the asymptotic properties. 

Remark 2 Other estimators of the distance step such as the median or the root mean square 
neighbor distances can be used. However, the equation ()38|) leads to consistent convergence 
properties for the variance of the sample constraints, regardless of the spatial dimension. 




(38) 
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The kernel bandwidth, hi := /ii(ai), is then given in view of ()32|) by 



hi = di B~^'^. (39) 

The above gives an exphcit hnear solution for the bandwidth in terms of the step. In 
practical applications p„ is a small parameter, and (jH^ is sufficient. Alternatively, can 
be solved numerically to obtain the bandwidth in the pre-asymptotic case. 

Lemma 4 Let us assume that the sampling network densification conditions I[1U\} and ill]) 
hold, i.e., \VLn\ oc , where < 5 < 1 and hi oc n^'^ for every realization of the sampling 
network. Then, if the gradient step is defined by the equation ^EE), the bandwidth exponent 
satisfies the inequality < 7 < {1 — 6)/d. 

Proof: It holds that J2pli > \^n\ (where Vd is a geometric constant that depends 
on d). Since A^o > in light of equations (jHH|l and ()39|1 it follows that h'l > v'^ ^n~'^'^~^^ for 
any n, which implies that (i 7 < 1 — 5. 

Theorem 1 (Mean of the Sample Gradient Constraint - Differentiable Case.) Assume 
that conditions n\]- Ml\) above are satisfied, and that F\ is four times differentiable in a 
neighborhood of zero. Then Tpl{hi) is an asymptotically unbiased estimator of the stochastic 
constraint 0i(ai). More specifically, the following holds 

E [Tp{{hi) - 0i(ai)] =dT2h\ {B, - El) + 0{hlpr,) + o{h{), (40) 
where T2 = Ff^(0)/2. 

Proof: Since Tp^i^hi) = d fx{hi) and 0i(ai) = d Fx{ai) we focus on fx{hi) and Fa to 
avoid unnecessary clutter. The sample function fx (hi), defined in (j2SI), is expressed as 
follows in light of equations (pUj) and : 

K[fx{hi)] involves an expectation over both the sampling point distribution and the distri- 
bution of the field values. Hence, we can write 

E[Mhi)] =E{E [Mhi)/Vi, . . . , U„] } , 
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where the inner (conditional) expectation is over the field values keeping the sampling loca- 
tions fixed, whereas the outer expectation is with respect to the sampling point distribution. 

Calculation of the Conditional Expectation E /Ui, . . . , U„] . 

Since only the numerator of (PT|) depends on the field values, we obtain 



E[7^(/ii)/Ui,...,U„] 



,,{f,(l„u,,,)} 

K;,, {1} 



(42) 



Leading -order calculation o/K/jj-j Fx(L„Ujj' 
Using Lemma (Q) we obtain 



E 



W, 



{^a(^ 



U 



rfa;ir(||a;||/p„) (L„||a;||)/i(a;) 



dujuj''-^K{uj/p:)F^{Kuj) I dO J,{e)h{uju) 
p^A,[/i(0) + O(p„)] / duu''~'K{u)Fx{hu). 



Since the kernel is compactly supported, it is possible to approximate F\{hiu) with a Taylor 
series expansion around zero, i.e.. 



Fa(/IiM) = T2 U'hf + n U%\ + 0{h\ 



(43) 



where Tj = F^\{}i)/i\^ i = 2,4. Inserting the expansion in the integral it follows that 



Wij {Fa (L„ U,,,) I = p^A, [/i(0) + 0{pn)] [t2 hlmK,d+2 + U h\mK4+i + o{h\)\ . 



E 

Finally, based on the above and Lemma ((H), it follows that 

K,,{F,(L„U,,)}=n2p^A4/i(0) 
Using this equation in connection with ()35p and ()4ip leads to 

E [:^(/ii)/Ui, ...,\]n]=r2h\B2 + n h\ B, + 0{h\pn) + o{h\) a.s. (44) 

The respective expansion for F\{ai) is obtained using the consistency principle, (jH21), as well 
as equations and (jSHI), i-e.. 



Fx{ai) = T2 hi B2 + r4 h\ B^ + 0{h{pn) + o{h\), a.s. 



(45) 
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From the equations ()44p . ()45|) and Lemma Q, it follows that 

E [f^ih) - FA(ai)] = n h\ {B, - Bl) + 0(/i?p„) + o(/it). (46) 

The asymptotic convergence then follows from the densification effect, i.e., from 7 > 0. 
In light of the above, fx{hi) is an asymptotically unbiased estimator of F\{ai), since the 
difference E — -^A(cti) converges to faster than each component as hi 0. 

If we consider fluctuations in the bandwidth and the step, hi on the right hand-side of 
equation should be replaced by the respective mean value, and the corrections should 
also include bandwidth fluctuations. 

Remark 3 The asymptotic decline of the bias as hf follows from the consistency principle 
and does not require the specific choice of the step (jHH|l . The latter may only influence the 
upper bound of the bandwidth exponent 7. 

The following is also a direct consequence of Theorem and Lemma Q : 

E[5r(ai)] - E[Siiai)] = n hi (^|^ - + 0(j9„) + o{hl). (47) 

Equation ()47|) shows that the generalized gradient 5i(ai) is an asymptotically unbiased 
estimator of the stochastic constraint E[S'i(ai)]. 

Theorem 2 (Mean of the Sample "Gradient" Constraint - Continuous Case.) Assume that 
conditions /f7))-/ lii|) above are satisfied, and that Fx is continuous but non differentiable at 
zero. For a'l = hiBi, it follows thatTpi{hi) is an is an asymptotically unbiased estimator of 
the stochastic constraint 0i(a'^). More specifically: 

Ep-i{hi) - 0i(a;)] =dT2hl {B2 - Bl) +dnhl (53 - Bl) + 0{hipn) + o{h\). (48) 

In addition, Tpl{hi) is an asymptotically biased estimator of the stochastic constraint 
0i(ai), i.e., 

E[lp-i{hi) - 01 (ai)] = dTihi (Bi - bI'^'^ +dTshl (53 - 52'/') + 0{hipn) + o{hl). (49) 
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The mean relative error of K[ipi{hi)] is given by 



ipijhi) - 0i(ai) 
01 (ai) 



-Bl,2 



hi 



ra B : 



3,2 



+ 0{pn)+0ihl), 



(50) 



where Bi^2 = Bi — B^"^, and B^^2 = -B3 — -82^^- 



Proof: The logic of the proof is the same as in Theorem and therefore we only present 
the main points. The derivatives of Fa do not exist at zero. However, if Fx admits at least 
third-order derivatives for any hiu > 0, the Taylor series expansion of the semivariogram is 
expressed as 



Fx{hiu) = Tiuhi + T2U^hl + T^u-^hl + 0{hiPn) + o{hl) 



where = F^\o^)/i\. Then we obtain 



E 



and in connection with ()35|) and ()41|) it follows that 

3 

E[/^(/ii)/Ui,...,U„] =J2'^jh^B,+o{hl) a.s. 

i=i 

Based on (jHH), the semivariogram Fx{a[) is expressed as 

3 

Fx{a\) = J2 '^J B{ + 0{h,p^) + o{h\). 



Hence, we obtain 



(51) 



(52) 



(53) 



E[7^(/ii)/Ui,...,U„] -F,(a;)= Y,T,y^{B^-B[)+0{h,p^) + o{h\) a.s. (54) 

i=2,3 

The above proves equation ()48p. The equation ()49p is proved in the same way, but the 
expansion is replaced with an expansion around ai, i.e.. 



Fx{ai) = n hi B^^ + T2 h\ B2 + rg hf B^ + o{h\). 
The above, in connection with ()52|). leads to 



(55) 



E[/^(/ii)/Ui,...,U„]-FA(ai)= 5^r,M {b, - B^"-^ ^ Oih^Pn) ^ o{h\) a.s. (56) 



i=i,3 
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Finally, equation ()50|) for the mean relative error (relative bias), follows from ()55|) and ()56p . 
Based on equation (j^Ujl the relative bias depends on the kernel function through the 

1 /2 

coefficients B12 and i?3_2. As hi — > 0, the relative bias converges to B12/B2 . 

Lemma 5 The asymptotic relative bias, tp^,!, is a non-positive number. 

1/2 

Proof: B2 is a positive number. By definition, B12 = Bi — B^ ■ Let us define the density 
function 

d-l 



fKis):= Jff' ^ ^ , se[0,R]. 



s s" 



In light of this definition and equation p9|). we obtain Bm = Kk[s"^], where Ek denotes the 
expectation with respect to the density function /x- Then, B2 — BI = Ex [{s — Ex[s])^] > 0, 
and thus B12 < follows directly. 

As a direct consequence of equations pTIjl and (jH^ . one obtains that E[i5i(ai)] oc 0(/i|f^). 
Hence, the sample function Si{ai)] is not well defined at the asymptotic limit. Thus, the 
"gradient" constraints in the continuous but non-differentiable case refer to the sample 
function ^(/ii) and its stochastic counterpart, 0i(ai). 

Theorem 3 (Variance of the Sample "Gradient" Constraint.) // the conditions /f7))-/ lii|) 

above are satisfied, Tpiijii) is an asymptotically consistent estimator o/0i(ai). In particular, 
the variance ofTpi{hi) is given asymptotically by: 

Yarmh)]=o(^-^^^^y ei = min{5,2-5-ci7}- (57) 
Proof: The variance of fx{hi) is given by means of: 

Yarlhihi)] = E [var [7^(/ii)/Ui, . . . , U„]] + Yar [e[7^(/ii)/Ui, . . . , Uj] . (58) 

According to Eq. (j44|) in the differentiable case, and Eq. (j52p in the non-differentiable case, 
the second term on the right hand side of Eq. is o{hl). Hence, we focus on the first 
term, which is expressed as follows: 



Var|/x(fti)/Ui,,,,,U, 



W,,jWk,i 
[K;,,(l)]' 
Via + ^1,2 + V^i,3, 



(59) 
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where the functions Vi,i, Vi,2, ^1,3, in hght of ip\ defined in (IHUjl . are given by 



Vi, 



Yl'i,j,k,l^i,j ^\ {Si,k, Sj^l, Si^l, Sj^k) 



(60) 
(61) 
(62) 



Leading- order calculation of the denominator. 

The quantities Vi^i, Vi^25 ^1,3 in equations (jHOJ-linS) have a common denominator, the 
asymptotic behavior of which follows from (jH^j) . More precisely, the following is true: 

2 



— J 



(63) 



Leading-order calculation of Vi^i. 

Denote the numerators of (jUm) -()62 |) by ,j = 1, 2, 3. Then, it follows from Lemma 
that N^^'l = 2n^ n[^^ almost surely, where : 

= J da;/^2(||a;||/p„)^;,(0,0,L„||a;||,L„||a;||)/i(a;). 

We use the variable u = uj/pn, and a Taylor expansion of /i around zero. We evaluate the 
integral over u with the mean value theorem. Finally, we apply the first scaling property of 
ipx in condition (jH)), to obtain the following 



J Jsa ^ 

= pUliu*) (I) \,fm jyuu'~^K\u) + o{pihl^^). 
Hence, it follows that N^l is given by 



Nt}=2n'plg,{u*) 



2 „d 



2ci 



A,fm^td + o{n'p'ihf^) a.s 



2 T 2C1^ 



(64) 



Finally, from equations (jU^ . and based on Lemma (H)), it follows that 

.(2) / rrf \ / L'* 



^1, 



m 



K,d 



n2 ht^'' 



a.s. 



(65) 
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Hence, the asymptotic dependence of Vi,i on n becomes 



y,, = 0(n^-2+^'^-2ci7). (66) 
Leading-order calculation of Vi,2- 

The numerator of Vi^2 is equal to n[^2 = n[^2 almost surely, where: 

= jj du;iduj2 K (^^^^^ K (^^^^^ ^ 

Converting u^i and 0^2 to spherical polar coordinates, using the perturbation parameter 
with the change of variables Ui = ooi/pn, M2 = ^2/Pn leads to: 

= Pn I duiui~'K{ui) I du2ut'K{u2) n / de.Ue,) 

J J i = l,2 "^^d 

tpX {LnUJ2, LnUJi,0, Ln\\uJiU}i - UJ2<^2\\) /2 (^^1 <^l,UJ2i^2) ■ 

We evaluate the integrals over uji and 0^2 using the mean value theorem, defining Mi 2 •= 
a)]" — M2 C(>2||- By applying the second scaling property of condition (jHl) for ipx, the 
following is obtained: 



= 'lpx{hlU2,hiUi,0,hi\\uiU:i-U2UJ2\\) f2{PnUlLdl,PnU2U:2) 

= g2{u*2,ul,0,ul2) Ij] pl^(mK,dAd) /s (0) + o (p^^^ /,2ci) 



Hence, the following expression is obtained for the numerator of Vi^2'- 



= "^n'PuK^^lf^i^) 92{u*2,ul,0,ul2) (j'^ 



I) 

+ o(p^'^n=^/^^) a.s. (67) 



Finally, based on equations (fHHjl . (|Fr7j) and Lemma ((H), the following asymptotic ex- 

pression is obtained for Vi 2 

V'u = [—)+"[—) (88' 

Therefore, the asymptotic dependence of Vi,2 on ^ becomes 

^^_2 = 0(n-i-2ci7). (69) 
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Leading- order calculation of Vi^s. 

The numerator of Vi^s, n[^^, includes a summation over quartets of sampling points and 
thus involves the joint pdf of three independent distances (\Jij,\Jk,i,Vi^kj . For reasons 
of brevity, we denote Uij = iVi, u^/ = (jJ2, and Uj^^ = uj-f, then u^^; = + <^3, and 
Uj^i = UJ2 + (ji>3 — <^i; also Ui^i = \\uj2 + (^all and Uj^i = \\uJ2 + UJ3 — 

According to Lemma (Q), n[^1 = n'^N^^^ almost surely, where N^^'^l is given by 



A^^^3 := E K {Uij/pn) K {Uk,l/Pn) i^x{LnUi,k,LnUj^l,LnUi^l,LnUjj 

du)idu)2du)3 K [ il^^:^ ) K [ l^^ili ] (cj^, 0^3, u)2 + (^3) 




Vn J \ Pn 
IpX {LnUJs, LnUj^l, LnU^, L^UJi's) . 

Converting cJi, 0:2 and u}^ to spherical polar coordinates, the following expression is obtained: 

3 3 

i^lJ \PnJ \Pn/ .^-^Js^ 

ifjX {LnUJs, LnUj^l, LnU^, L^UJi-i) {uJiUJi , 1^3^03, UJ2U)2 + t^3'^3) • 

Using the variable transformations Ui = uji/pn, U2 = uj2/pn, U3 = LnUJ3, and the Taylor 
expansion of /s around (0,0,0), N[^^ is transformed as follows 

= ^ / / J du3{u^U2U3Y''Kim) Kiu2)llJ^ d9,,Udi) 

V'A (^^3, \\h1U2UJ2 + U3UJ3 - hiUiU}i\\, \\U3UJ3 + hiU2UJ2\\, 
||«3^3-Ml^l||) [/3(0,0,0) + O(l) 

The integrals over ui, U2 and 6i are evaluated using the mean value theorem and the third 
scaling property of condition (jH}: 

^JS = L^lf) /3(o)(m^,.A,) j du3ui-'g3{u3,ui,u*2,ei,e;,0;) 

MPnL-'h'n. 

Finally, the following expression is obtained for A^}^^ 

<3 =C*n' {0 (f)''' /3(0) imK,,A,f + oipin-'^hr^l (70) 
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where C* = J du^ uf~^ g-s, (M3, m^, ^Jl; ^2; ^3) a finite constant thanks to assumption ©. 

Hence, based on equations (jSZI), (jEBl), and Lemma the following asymptotic 
expression is obtained for 3 



Variance Convergence Rate. 

Based on equations (jHSl), (EH) and (f7^. the convergence of Vi,3 is slower than that of 
Vi^2 since 5 < 1. The convergence of Vi^i is faster than that of Vi^s if •y d < 2(1 — 5). If this 
condition holds, then Vi^3 is the rate-limiting term. In light of Lemma (0} this inequality is 
satisfied for the bandwidth defined by (jHE))- 

Remark 4 The rate of convergence of the gradient estimator's variance is the same for the 
differentiable and non-differentiable cases. The three terms, i.e., ^1,1, Vi^2; ^1,3? possess dis- 
tinct convergence rates. These terms correspond to sample functions that involve doublets, 
triplets and quartets of non-identical sampling points. Using the step estimate (fH^ and the 
consistency principle, the slowest convergence rate (asymptotically dominant term) is due 
to the term that involves quartets of non-identical points. On intuitive grounds, we would 
expect the same behavior to hold for different step estimates. 

D. Generalized Curvature Estimation 

In this section we prove a relation between the "curvature" step and the kernel bandwidth 
h2, and we propose an estimate for the step. We then investigate the asymptotic properties 
of the mean and variance of the generalized curvature estimator. In the process, we also 
show that to a first approximation yUi = /X2 = 1 and we calculate the asymptotic dependence 
of the leading corrections. 

In the proofs of asymptotic dependence, we will use the following modification of 
Lemma (QJ. 




a.s. 



(71) 



Hence, the asymptotic dependence of Vi^s on n becomes 




(72) 




(73) 
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Lemma 6 The following relation holds between the bandwidth h2 and the curvature step 02.' 



a2 = h2 B4 + 0{h2qn) a.s.. 

Proof: The proof is along the hnes of Lemma Q- According to the definition ()29p and the 
kernel-average equation the step 02 is defined by: 



Leading-order calculation of K[Qij(^r)]- 

nQ^,Jir)] = jdujK{Lju:\\/rh2)f\{u;) 

= [ duJu''-'K{uj/rqn) [ dO UO) fi {lucj) (75) 
= /g^^/i(0)m;,,, + O(gfi). 

Hence, we obtain 

Krh, {1} = r'^ A, /i(0) tuk^ + O (n^ a.s.. (76) 

The term {sfj} is a special case of K^fej {"^i^}) which we evaluate below. 

Leading-order calculation of E [Qij(r) s^] . 

= r^+"^ A, /i (0) mK4+m + O (L^T • 

Hence, it follows that 

From the Eqs. ()76|) and ()77|) it follows that 



= ^'\:f = ^"^ ^™ ^™ + 0{h^ ^n) a.s.. (78) 



The asymptotic behavior of 02 is obtained from (|78|) for r = 1 and m = 4: 



a^ = hlB^ + Oih^qn) a.s.. (79) 
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The coefficients ^i{hi) and ^2{h2) appear in the definition of the generahzed curvature 
constraint. We calculate the asymptotic dependence of these coefficients. 

Lemma 7 The coefficients ^i{hi) and /i2(^2) ore given asymptotically by: 

fij{hj) = 1 + o{l) a.s., for j = 1,2. 

Proof: Based on the equations (jTTj) and (j2Hl) the coefficients involve the averages {sfj)rh2 
and {sfj)rh2j where r = 1,2, ^/2. Both averages are given by equation (fTSj) . The lemma is 
proved following straightforward but tedious algebraic manipulations. 

For the curvature step we will use the same expression as for the gradient step, i.e., 
a2 = ai, given by equation (jHEI)- The kernel bandwidth, /i2 := h2{a2), is then given in view 
of jZni) as follows: 

h2 = a2 Bl^'\ (80) 

Lemma 8 Let us assume that h2 oc . Then, if the curvature step is defined by the 
equation the bandwidth exponent satisfies the inequality 0<i'd<l — 6. 

Proof: The proof is completely analogous to the proof of Lemma Q if 7 is replaced by u. 

Theorem 4 (Mean of the Sample Curvature Constraint - Differentiable Case.) Assume that 
hypotheses /f7))-/ lii)) above are satisfied, and that Fx admits five derivatives in a neighborhood 
of zero. Then ^(/i2) is an asymptotically unbiased estimator 0/02(02)- More specifically, 
the following holds 

E mh2) - 02(02)] = -2M{d + 4) re (b^ - B'J^^ + 0{hl g„) + o{hl), (81) 
where Tq = and gt = mK,d+k/mK,d- 
Proof: 

Based on (pUj) . 02(02) is expressed in terms of fx{rh2) as follows: 

^(^2) = I [cf/ii(/^2) Mh2) - cf^f^2{h2)lx{V2h2) - 4' ^7x(2/i2)] • (82) 

The sample function fx{rh2) is defined in terms of (|^. and it is expressed in light of (PT|) 
as follows: 

Mrh2) = (83) 
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Hence, E[(y92(/i2)] is expressed in terms of E[/x(r/i2)]. As in Theorem the ensemble 
average imphes E[/x(r/i2)] = E {E [7^(r/i2)/Ui, . . . , U„] } . 

Calculation of the Conditional Expectation E \Jxirh2) /Ui, . . . , U„] . 
Only the numerator of (jH^ depends on the field values, i.e.. 



E[/x(r/i2)/Ui,...,U„] 



Krh, {1} 



(84) 



Leading- order calculation of K. 



rh2 ^^X 



v(-^nUij) ^ . 



E 



Q„-M|F,(L„Ui,,)}] = j du:Ki\\u:\\/{rq^)) /iM 
duJUJ^-'Kioo/irq^)) FxiLnOo) [ d9 UO) fi (uu;) 

= r'^g^A4/i(0)+O(p„)] j duu''-'K{u)Fx{rh2u). 
The sixth-order Taylor series expansion of F\{rh2u) around zero yields 
Fx{rh2u) = TsrV/i^ + r^r^^h^^ + T^r^hl + o{hl). 
Inserting the expansion in the kernel integral, it follows that 



(85) 



E 



= qtA, [/i(0) + O(p„)] \ ^^mK4+,n{rh2y + o{hl) 
The above in connection with (fTUI) for the kernel average K^hj {1} lead to: 

E [/x(r/i2)/Ui, ...,\Jn]= 3^n{r /i2)^ + 0{hl g„) + o{hl) a.s. (86) 

i=2,4,6 

From (IHUj) and (jH^ it follows that the Oih^) term vanishes if the coefficients /ii(/ii), /i2(^2) 
are defined as in equations and (j2HI)- Finally, we obtain 

E [^(/i2)/Ui, . . . , U„] = -4°) B, n hi - 24d{d + 4) B, r, hi + 0{hl g„) + o{hl) a.s. 

Using the definition of 02(^2), equation (jU)), the expansion (|H^. and the step - bandwidth 
relation, (jHUIl . a series expansion is obtained for 02 ('^2) 

02(02) = -4°V4 h^ S4 - 24d{d + 4)r6 /i^ 5f ^ + 0{hlqn) + o{hl), a.s.. 

The two preceding expansions allow calculating the bias for the curvature constraint by 
subtracting the terms on the respective sides. The proof is completed by applying Lemma 
121 to obtain equation (|HT|. 
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Theorem 5 (Mean of the Sample "Curvature" Constraint - Continuous Case.) Assume 
that conditions /f7])-/ lii|) above are satisfied, and that F\ is continuous but non differentiable 
at zero. For a'2 = /i2-Bi, it follows that Tp^{h2) is an asymptotically unbiased estimator 



of the stochastic constraint 02(fl2)- More specifically, if 



(4) 
d 



.(2) 



v/2cf -2c 



(1) 

d 



.(5) 



.(2) 



.(3) 



then: 



E [^(/i2) - 02(4)] = 4'^ rs hi {B, - Bf) + o{hl) + 0(/i2 ?„)• 



, and 



(87) 



In addition, f2{h2) is an asymptotically biased estimator of the stochastic constraint 
02(^2), i-e.. 



E [^(/i2) - 02(02)] = n /i2 - ^4^^" I + r3 ^3 - B 
+ o{hl) + 0{h2qn). 

The mean relative error of K\lp^(h2)] is given by 



>l/4 



.(5) 



?3/4 
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A,2 := E 



V92(/l2) - 02(02) 



02(02) 



Bl.4 



where Bi^^ = Bi — B^^, and -83,4 = B, 



B, 
B 



1/4 
1 

3/4 



5 



3,4 



1/4 



+ 0(pO + o(/i2), 



(89) 



Proof: 

First we calculate E [^(/;,2)/Ui, . . . , U„]. This requires calculation of 

E [/x('^^2)/Ui, . . . , U„] . The latter is given in equation (jSH). On the right hand 
side of that equation, the denominator, Krhajl}, is given by equation (f76|) . The numerator, 

according to (f73|). 



(90) 



Krh^iFx {Ln Uij)}, converges to E Qij(r) |Fa (L„ Uij) 
For /i2M > the Taylor expansion of Fx is given by 



Fx{rh2u) = Tiruh2 + T2r^u^Kl + r^r^u^hl + oQi^)- 



where = F^\{)'^). Then, we obtain by the standard procedure 



E 



Based on the above, it follows that 



E [/x(r/i2)/Ui, . . . , U„] = ^2 + C'(/i2 g„) + o{hl) a.s. 



i=l,2,3 



(91) 
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Finally, using Lemma Q and equation ()82p. we obtain the following 



E[^(/i2)/Ui,...,U„] 



cf n h2 Bi + cf n hi 53 + o{hl) + 0{h2qn) a.s. 



where the term 0{h'^ vanishes due to cancelation of the coefficients. Based on © and the ex- 
pansion the stochastic term is expressed as 02(02 = -^1 ^2) = c^f ''"1 ^2 -^i+c^^'' ''"3 ^2 -^1 + 
o(/2.2)- This expansion in connection with the one above for E [^(/i2)/Ui, . . . , U„] leads to 



The proof of equation ()87|1 is completed by applying Lemma [21 to the above result. The 
estimator is asymptotically unbiased since the bias converges to faster than either the 
sample or the stochastic constraints. 

Equations (jHH|) and (jHHl) follow along the same lines. The main difference is that the 
stochastic constraint now becomes 02(^12)5 where 02 = (-84^^) h2 according to (fHUI. 

Lemma 9 The asymptotic relative bias, ipe,2, is non-positive. 

Proof: As /i2 — > 0, the relative bias converges is a positive number. By 

definition, Bi^^ = Bi — B\^^ . Using the density function defined in Lemma (0), we can write 



Theorem 6 (Variance of the Sample "Curvature" Constraint.) If the hypotheses in\)- Ml\) 
above are satisfied, thenTp^{h2) is an asymptotically consistent estimator (f)2{a2) ■ More specif- 
ically, the following holds: 



Proof: The proof is based on the same approach as in Theorem IHl The calculations are 
more extended due to the cross-products between the sample functions /x(^2), /x(v^^2) 
and /js:(2/i2)- However, in this case we also obtain terms containing doublets, triplets and 
quartets of sampling points. Since the complications are of a trivial nature, the lengthy 
calculations will be omitted here. Using for the curvature step equation (jHH)) . the quartet 
term dominates the convergence. This term leads to the slow asymptotic decline of the 
variance as 0{hl^^ / Lf^ or equivalently as 0{n~'^^'^'^~^). 



E[^(/i2)/Ui,...,Uj-02(a2) 



cf rs hi {Bs - Bl) + oihl) + 0{h2qn) a.s. 



Bi-Bf> Ek {s^ - E^[s]}^ > 0, from which it follows that ^1,4 < 0. 




(92) 
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Triangular Quadratic Gaussian Tricube 





-0.0472 


-0.0440 


-0.1138 


-0.0390 


B2 


1/5 


1/3 


1 


22/91 




-0.1056 


-0.0762 


-0.1138 


-0.0793 


-Bl,4 


-0.1170 


-0.1056 


-0.3030 


-0.0959 




1/14 


1/6 


2 


22/243 




-0.2263 


-0.1653 


-0.2548 


0.1748 



TABLE I: Calculations of i3i^2i -81^4, i?2, ^4, and the relative bias of the "gradient" and "curvature" 
constraint estimators using different kernel functions. 

E. Calculation of Asymptotic Bias 

The asymptotic relative bias of the "gradient" and "curvature" constraint estimators 
obtained for different types of kernel functions, according to equations and (jEHI), is 
shown in Table (jl]). In particular, we include the Gaussian kernel, K{s) = exp(— s^), the 
triangular kernel, K (s) = (1 — ||s||) lo<i|si|<i5 the quadratic kernel, K{s) = (1 — lo<i|si|<i, 
and the tricube kernel, K{s) = (1 — ■lo<||s||<i- The Gaussian kernel is not compactly 

supported, but it decays to zero very fast. The quadratic kernel gives the lowest relative 
bias, followed by the tricube kernel. 

VII. SSRF MODEL PARAMETER INFERENCE 

Model parameter inference is based on the procedure introduced in which is expanded 
herein. The main idea is to estimate the SSRF parameters, 6 by matching sample constraints 
with their stochastic counterparts. We use the sample constraints Sq, Si{ai) and 152(02), 
given by equations (fT^ . (j^ . respectively, as well as the stochastic constraints E[S'o], 
E[S'i(ai)] and E[S'2(a2)], given by equations (fTTj). (HH), (fT3|) respectively. We also impose the 
normalization constraint (jl3|) . 

Determining 6 is then expressed as an optimization problem that aims at minimizing 
the deviation between the stochastic moments and their estimators; the latter include the 
sample-based variance, gradient, and curvature constraints, and 1 for the normalization 
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constraint. We introduce the metric $ (X*; d') to measure the distance between the sample 
and ensemble constraints: 



(93) 



^52(02) E[^i(ai) 

In equation (|^. E[S'o], E[S'i(ai)] and E[S'2(a2)] are the values of the constraints obtained 
for the "current" values of the Spartan parameters rji, £ and fcmax and for Si = 02 as given by 
fl38|) . The simplex search method of Nelder and Mead j2]| is used for the optimization. The 
initial parameter vector 6^^^ is updated at every optimization step. For rji the value rjf^^ = 1 
is arbitrarily chosen. The initial value ^"-^^ of the characteristic length is estimated from the 
data. The initial estimate of the characteristic length is given by ^^^^ = [Si{ai) / 32(0,2)] 
The frequency cutoff kc is chosen according ki^^ = 27ra|f ^. The final vector, 6, to which the 
optimization converges gives the optimal parameters of the SSRF model. 

Note that the functional $ (X*; 6') is independent of tiq, which can be set equal to 1 
during the optimization. The optimal fjo is obtained using the condition of the variance 
independence from kc^ and rji, i.e., equation (fT^ . which leads to the following: 

r)o = 2^7r'^/2p(^/2)a2. (94) 

In light of (fTT|) and (fT^ . equation guarantees that the model variance matches the 
sample variance. 

Some comments are in order regarding the definition of the distance metric ()93|). The 
functional is of the general form $ = Yl^i=i{^ " ^l^^Yi where 

°' 5r(ai)E[5o]' ^(a2)E[5i(ai)]' 

The number of terms (squares) in $ is equal to the number of variables. The Zi are functions 
that involve specific sample and ensemble constraints. The definitions of 2:2, z^^ are motivated 
by the goals of (i) eliminating the dependence on r^o (since the latter is an overall scaling 
factor) (ii) defining dimensionless variables so that the optimization does not depend on the 
units used and (iii) forming combinations of constraints of similar magnitude so that they 
contribute on an equal footing in the optimization. 
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Straightforward constraint differences, i.e., Si — E,[Si\ are neither dimensionless nor of 
similar magnitude. Using ratios iSi/E[S'i] yields dimensionless ratios of similar magnitude, 
but it preserves the rjo dependence. The proposed combinations for Zi for i = 2,3, which are 
of the form 5j_iE[S'j]/ iSiE[S'.j_i] involve ratios of the form E[S'j]/E[S'i_i], which eliminate 
the rjQ dependence. A significant advantage of using ratios 5i/E[5'j] is that 5i/E[5'j] = 
^i{hi) / (pii^ai) for i = 1,2. That is, the terms af^ in the denominators of both Si and E[S'i] 
drop out - see equations (jHj), for the generalized gradient constraint, and (fTHjl . for 
the generalized curvature constraint. For example, in the case of the generalized gradient 
constraint this means that even if the limit of fx{hi)/a\ for Oi — is not well defined 
(i.e., for non-differentiable models), the ratio iSi/E[S'j] oc fx{hi)/Fx{ai) is still well defined. 
Similarly one can show that the respective ratio for the generalized curvature constraint is 
also well defined. 

Larger values of the exponent /3 give smaller values of the distance functional for the same 
number of iterations. The results for the SSRF parameters do not depend on f3. Hence, f3 
is a handle on the convergence rate of the optimization and can be set to one. 

Multiple "solutions" of the minimization problem for the model parameters can not be 
ruled out. The distance functional has by definition a single solution in terms of the Zi. 
However, since the dependence Zi{rii,^, kc) is nonlinear, more than one solutions for {rji, C,, kc) 
may be possible, or the optimization algorithm may get trapped near local minima. It is 
acceptable to have more than one solutions corresponding to different types of "reasonable" 
spatial dependence. 

VIII. NUMERICAL SIMULATIONS 

Numerical experiments based on simulated samples are conducted to illustrate the per- 
formance of the proposed SSRF inference process. The experiments investigate the ability 
of FGC-SSRFs to model spatial distributions generated based on commonly used theoretical 
models. The comparisons are based on the covariance function and on cross-validation. 

A. Covariance Estimation 

Three covariance models are considered: 
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1. Spherical 

c.(r) = a^{l-^^ + i^}lo<„r|,<.., 

2. Exponential 

Ce(r) =(rl exp(-||r||/6e), 

3. Gaussian 

c,(r) = a^xp (-llrf/feg. 

A uniform distribution of ri = 200 sampling locations Si, . . .s„ on the two-dimensional 
domain [0,5] x [0,5] is assumed. The simulated data are denoted by X^^\si), 1 < i < n, 
where 1 < m < M, is the sample index. The data are generated from a standard Gaussian 
spatial random field (zero mean and unit variance) using the Cholesky LU decomposition 
method. The spatial dependence is given by the three models above, with correlation lengths 
bs = 1 and be = 0.5, bg = 1. For each covariance model M = 100 independent samples are 
obtained. Each realization differs from the others in both the sampling locations and the 
values of the field. The triangular kernel with support [0, 1] is used in SSRF parameter 
estimation for all the samples. 



(a) (b) 
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Distance lag Distance lag 



FIG. 1: (a) Spherical covariance (continuous line) and Spartan covariance estimator (broken lines), 
(c) Box plots of Spartan covariance estimator versus the spherical model. Plots (b) and (d) are 
the counterparts of plots (a) and (c) respectively for the correlation function. 
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FIG. 2: (a) Exponential covariance (continuous line) and Spartan covariance estimator (broken 
lines), (c) Box plots of Spartan covariance estimator versus the exponential model. Plots (b) and 
(d) are the counterparts of plots (a) and (c) respectively for the correlation function. 
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FIG. 3: (a) Gaussian covariance (continuous line) and Spartan covariance estimator (broken lines), 
(c) Box plots of Spartan covariance estimator versus the Gaussian model. Plots (b) and (d) are 
the counterparts of plots (a) and (c) respectively for the correlation function. 
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For each sample, the SSRF covariance estimator is calculated at 10, uniformly spaced 
intervals between and 1.2. Figure ^ displays the results for simulations based on the 
spherical model. The covariance function obtained from a single sample is shown for plot 
(a), and for the correlation function in plot (b). The latter is obtained from the SSRF 
covariance estimator following division by the sample variance estimate and eliminates the 
impact of sample-to-sample variance fluctuations. Box plots based on all the samples are 
shown in plot (c) for the covariance function and in plot (d) for the correlation function. The 
same plots for the exponential model are shown in Figure El and for the Gaussian model in 
Figure El The closer agreement is between the SSRF estimator and the exponential model. 
This is justified by the fact that in d = 3 the SSRF model for ^ oo and ?7i = 2 is 



equivalent to the exponential covariance 



121. The SSRF estimator matches the Gaussian 



covariance very well near the origin, due to the differentiability of both models. At large 
lags the SSRF model box plots exhibit considerable scatter, which is due to the fact that 
for certain realizations the optimization converges to negative rji. 

It is clear from the plots that the SSRF model does not provide a perfect match with 
the "true" covariance models over the entire range of lags. However, this is not a major 
obstacle in geostatistical applications, in which the "true" covariance is unknown. In prac- 
tice, estimation of the empirical covariance function (or equivalently the variograrn) from a 



20|. The 



single sample involves considerable uncertainties, which are difficult to quantify 
uncertainties are more pronounced at larger lags, at which the averaging procedure involves 
a smaller number of pairs. Moreover, the theoretical covariance functions merely repre- 
sent approximations of "actual" covariance functions, and thus do not have any "inherent" 
advantage over the SSRF model. 



B. Cross Validation 



In geostatistical applications, the performance of a spatial model is typically evaluated 
by its ability to "predict" measured sample values at a number of cross validation points. 
Here we consider = 110 sampling locations over the domain D = [0, 100] x [0, 100]. The 
set of 110 points is partitioned into a validation set, S^, consisting of riv = 10 points chosen 
at random, and the training set, St, including the remaining nt = 100 points. The two sets 
of points are shown in Figure lU 
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FIG. 4: Locations of the training set (circles) and the vahdation set (stars). 



One hundred independent samples are simulated from a Gaussian SRF with mean 
mx = 70 and standard deviation cr = 10 using an exponential covariance model, which 
will henceforth be referred to as the "true model". The correlation length is set to 6e = 4 
(i.e., the correlation range, where the covariance drops to 5% of the initial value is ~ 12). 
The true exponential model and the SSRF covariance estimator, obtained from a single 
sample on St, are given in Figure The behavior of the Spartan estimator follows the plots 
of Figure that is, the SSRF overestimates the true model near the origin, where it fails 
to capture the abrupt decline of the exponential. 

The performance of the SSRF covariance model is evaluated by means of cross validation. 

n 

We use the method of Ordinary Kriging, e.g., p|, both with the SSRF covariance and the 
true exponential covariance to "predict" the field values in S^. The predictions based on the 



SSRF covariance will be denoted by X^^f (sj), while those of the true model with X^^™^(s 



{m)i 



Sj G S^] m = 1,...,M is the realization (sample) index. In general, a prediction will 
be denoted by xl;"^\sj), where "t=ssrf" for the SSRF model and "t=true" for the true 
covariance. The relative prediction error is then given by 

X^)(s,)-X(™)(s,) 



4"^Hs,) 



(95) 



XM(s,.) 

In Table 2 we compare for each point of the mean relative error (MRE) 



(et,i(sj)) = ]g X]m=i 4™''(sj), and the mean absolute relative error (MARE), (et,2(sj)) 



M ^ 



M 

m=l 



Sj)\. The MRE is 5% or lower for both estimators, as expected given the fact 



that kriging is an unbiased predictor. The MARE is slightly higher for the Spartan model. 
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This is explained based on the difference between the SSRF and the true covariance function 
(see Figure El below) . Note that at S4 both models give the same results for the MRE and 
the MARE. This happens because S4 does not have any nearby neighbors, and thus the 
prediction at this point is reduced to the mean value. 

The analysis in this section shows that the SSRF covariance model performs satisfactorily, 
in terms of cross validation compared to the predictions obtained with the exponential model 
used to generate the data. 

m =70, o =10, L=100, n=200, 5=4, Tria 
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FIG. 5: Comparison of the Spartan covariance estimator (broken line) with the corresponding 
theoretical model (continuous line). 
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